Genetic confirmation of a hybrid between two highly divergent cardinalid species: A rose‐breasted grosbeak (Pheucticus ludovicianus) and a scarlet tanager (Piranga olivacea)

Abstract Using low‐coverage whole‐genome sequencing, analysis of vocalizations, and inferences from natural history, we document a first‐generation hybrid between a rose‐breasted grosbeak (Pheucticus ludovicianus) and a scarlet tanager (Piranga olivacea). These two species occur sympatrically throughout much of eastern North America, although they were not previously known to interbreed. Following the field identification of a putative hybrid, we use genetic and bioacoustic data to show that a rose‐breasted grosbeak was the maternal parent and a scarlet tanager was the paternal parent of the hybrid, whose song was similar to that of the latter species. These two species diverged >10 million years ago, and thus it is surprising to find a hybrid formed under natural conditions in the wild. Notably, the hybrid has an exceptionally heterozygous genome, with a conservative estimate of a heterozygous base every 100 bp. The observation that this hybrid of such highly divergent parental taxa has survived until adulthood serves as another example of the capacity for hybrid birds to survive with an exceptionally divergent genomic composition.

of reproductive isolation (e.g., Rothfels et al., 2015). Moreover, in avian systems, research on the coloration patterns observed in hybrids between divergent parents has been used to learn about the inheritance of plumage and song traits (Williamson et al., 2021).
Hybridization among bird species, in particular, is known to be common (Grant & Grant, 1992), yet the majority of this occurs between very closely related species and within hybrid zones.
Here, we apply genomic and bioacoustic analyses to document the first described hybrid between two highly divergent species, rose-breasted grosbeak (Pheucticus ludovicianus) and scarlet tanager (Piranga olivacea), which occur sympatrically throughout much of eastern North America. Both species are members of the family Cardinalidae and have not previously been known to hybridize. Moreover, based on the time-calibrated phylogeny of Barker et al. (2015), they last shared a common ancestor >10 million years ago. While postzygotic incompatibilities have been shown to take much longer to arise in birds (Fitzpatrick, 2004), overall reproductive isolation is generally thought to be complete after approximately 2-4 million years in high-latitude avian species pairs where this has been studied (Price, 2008; Weir et al., 2015), making this naturally occurring, wild hybrid unusual. We use genomic data, combined with song recordings, to confirm the field assessment of the parental species and to quantify genome-wide patterns of heterozygosity.

| Observation
On June 6, 2020, in Lawrence County, Pennsylvania, S.M.G. heard a song that he took to be that of a scarlet tanager. He searched for the bird in order to photograph it, but the bird instead looked like a male rose-breasted grosbeak with marked differences in plumage and morphology (Figure 1). On June 7, 2020, R.M. successfully relocated the singing bird and mist-netted it using an audio lure of tanager song. Plumage differences distinguishing the individual from a typical male rose-breasted grosbeak included its black wings and tail without white markings, yellowish white underwings instead of pink, and a pink instead of black throat. It also had a small concealed pale yellow crown patch (Figure 1a-d). Morphological differences included a longer primary projection and a more elongated, shallower bill that was darker and more gray-green than the pink-ivory bill of a rose-breasted grosbeak. Its bill lacked a tomial tooth, a characteristic of Piranga tanagers. S.L. then extracted 5-10 μl of blood by venipuncture from the ulnar vein in the wing, which was then stored on Whatman™ filter paper. R.M. and S.L. then collected standard morphological measurements of the bird (Table 1).

We used Raven Pro Sound Analysis Software v1.5 (K. Lisa Yang Center for Conservation Bioacoustics, 2014) to assess the characteristics of the n = 25 recorded songs, excluding the two songs captured in the counter-singing interaction. For each recording, we generated a spectrogram, a visual representation of the sound with time on the horizontal axis, frequency on the vertical axis, and amplitude (or "loudness") represented by the darkness of the pixel (Figure 2). We used the annotation feature of Raven Pro to identify the time and frequency boundaries of the syllables within the recorded songs.
We then compared the frequency range, number of syllables, and duration of the putative hybrid's songs with previously published measurements of those attributes in scarlet tanager and rose-breasted grosbeak songs (

| Genomic methods
To estimate the genetic ancestry of the putative hybrid, we used low-coverage whole-genome sequencing as in Toews et al. (2020).
We first extracted DNA from the blood sample obtained from the putative hybrid, using Qiagen DNeasy spin columns and following the manufacturer's protocol (Qiagen). We then generated short-read

We compared read data of the hybrid with previously published short-read sequence data deposited in the NCBI Short Read Archive (SRA) to confirm the maternal parent species and to identify the putative paternal parent. The comparison species for the putative parents of the hybrid (a rose-breasted grosbeak, Pheucticus ludovicianus, and a scarlet tanager, Piranga olivacea) were chosen based on preliminary mitochondrial DNA (mtDNA) sequencing, morphological similarity, and qualitative song characteristics. The closest available complementary short-read datasets that included both parental genera (as of June 2021) were derived from an RNA-seq study of blood investigating hemosporidian parasites (Galen et al., 2020).
This included the rose-breasted grosbeak (Pheucticus ludovicianus; SRA Accession #SAMN11263484) and the western tanager (Piranga ludoviciana; SRA Accession #SAMN11263491; MSB:Bird:47847), but not the specific putative paternal parent species, the scarlet tanager (Piranga olivacea). Western tanagers do not occur in the eastern USA where the hybrid was reported, and beyond a scarlet tanager, the only other member of the Piranga genus that occurs in the region where the hybrid was reported is Piranga rubra (summer tanager).
However, we were able to use genetic data to identify genus-level assignment for the paternal parent and use other inferences to assign species-level identity (see below).
We used AdapterRemoval (Lindgreen, 2012) to collapse overlapping read pairs and trim low-quality bases from read ends. We aligned these reads to the only high-quality cardinalid genome publicly available, from the northern cardinal (Cardinalis cardinalis; GenBank assembly accession # GCA_014549065.1; Sin et al., 2020), using Bowtie2 (Langmead & Salzberg, 2012). We added to this assembly the full mitochondrial genome sequence from a separate C. cardinalis individual (NCBI GenBank accession #MH700631) to facilitate the alignment of mtDNA reads. For the data from the putative hybrid, we used read-pair information and set the maximum distance between pairs (the -X flag) to 700 bp. For the RNA-seq data of the parental taxa, we did not include read-pair information, as read pairs could span large and unpredictable intron junctions (i.e., we input each read file with the -U flag). We estimated mapped read coverage with

We compared the sequence of hybrid reads to the parental species from an arbitrary portion of the C. cardinalis genome that was from a large scaffold with sufficient coverage from all three species (scaffold JACDOX010000102, between 30,488 and 40,944 bp). We extracted the sequence using the "mpileup" command of Samtools (Li et al., 2009) and compared the sequences in Geneious v11.0.3.
This region aligns with the leucyl/cystinyl aminopeptidase (LNPEP) gene in the Ficedula albicollis (FicAlb 1.5; GenBank accession # GCA_000247815.2) genome assembly (Ellegren et al., 2012). The intermediacy of the hybrid was overwhelmingly supported by all genomic regions investigated, and thus we report the results of only this region to illustrate the hybridization patterns (Figure 3).
We also quantified global genome-wide heterozygosity of the putative hybrid's genome using genotype likelihoods in ANGSD (v0.934; Korneliussen et al., 2014) with the "-dosaf 1" flag to generate the site frequency spectrum. We then used this information to estimate the fraction of the genome with heterozygous sites.
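As a rough illustration of this last step, the sketch below converts a single-sample site frequency spectrum into a per-site heterozygosity. It assumes the spectrum has three entries (the expected numbers of sites carrying 0, 1, or 2 copies of the non-reference allele in one diploid individual). The function names and the input values are hypothetical and are not taken from this study's output.

```python
def heterozygosity_from_sfs(sfs):
    """Fraction of sites that are heterozygous in one diploid sample.

    `sfs` is assumed to hold three expected site counts:
    [homozygous reference, heterozygous, homozygous alternate].
    """
    if len(sfs) != 3:
        raise ValueError("a single diploid sample should give three SFS entries")
    total = sum(sfs)
    if total <= 0:
        raise ValueError("SFS must contain a positive number of sites")
    return sfs[1] / total


def bp_per_heterozygous_site(heterozygosity):
    """Average spacing, in bp, between heterozygous sites."""
    if heterozygosity <= 0:
        raise ValueError("heterozygosity must be positive")
    return 1.0 / heterozygosity


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the arithmetic.
    example_sfs = [990_000.0, 10_000.0, 0.0]
    h = heterozygosity_from_sfs(example_sfs)
    print(f"heterozygosity = {h:.4f}")
    print(f"about one heterozygous site every {bp_per_heterozygous_site(h):.0f} bp")
```

Expressing the result as base pairs per heterozygous site makes it directly comparable with a statement such as "a heterozygous base every 100 bp".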

| RESULTS
In the hand, we confirmed that the putative hybrid was a 1-year-old male based on its wing molt limits (Mulvihill, 1993) and cloacal protuberance. Morphometric comparisons (Table 1)

Qualitative spectrographic analysis of two of the vocalization recordings illustrated that the individual's song and call were comparable to those typical of scarlet tanagers, but not of rose-breasted grosbeaks. In two recordings, the bird sang 25 bouts of song and 1 partial song/call. The hybrid's song had a "burry" tone produced by rapid frequency modulation, a quality typical of scarlet tanager but not rose-breasted grosbeak (Figure 2). This quality is visible as a wide-bandwidth sound on a low-resolution spectrogram; in contrast, the tonal sound of a rose-breasted grosbeak appears as a thin line.
On high-resolution spectrograms, this quality can be resolved as a rapidly oscillating thin tone (Figure S1). Additionally, in the middle of one song, the putative hybrid produced a "chick-burr" vocalization, highly similar to the same vocalization made by scarlet tanagers.
Quantitative analysis of the recordings confirmed that the individual's song was within the range of scarlet tanager, but largely dissimilar to that of rose-breasted grosbeak. The number of syllables within the songs varied between 4 and 6 (mean ± SD: 4.96 ± 0.68; n = 25), which is within the typical range of scarlet tanager but fewer than the average ~10 syllables of rose-breasted grosbeak (Table 2).
The duration of the song varied from 1.28 s to 2.30 s (mean ± SD: 1.79 ± 0.25 s; n = 25), which was within the typical range for scarlet tanager but shorter than that of rose-breasted grosbeak (Table 2).
The full frequency range of the hybrid song was 1.2 kHz-5.9 kHz.
The reported typical ranges for scarlet tanager (2.2-5.5 kHz) and rose-breasted grosbeak (1.5-5 kHz) both fall within this range, making frequency range an uninformative feature for this identification.
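The reasoning about frequency range can be checked with a small sketch, using only the ranges quoted above. A feature discriminates between the candidate species only if the hybrid's range is consistent with one species and not the other; here the hybrid's range spans both. The function and variable names are illustrative.

```python
def contains(outer, inner):
    """True if the interval `inner` lies entirely within `outer` (both in kHz)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Frequency ranges in kHz, as quoted in the text.
HYBRID_RANGE_KHZ = (1.2, 5.9)
SPECIES_RANGES_KHZ = {
    "scarlet tanager": (2.2, 5.5),
    "rose-breasted grosbeak": (1.5, 5.0),
}

if __name__ == "__main__":
    for species, typical_range in SPECIES_RANGES_KHZ.items():
        covered = contains(HYBRID_RANGE_KHZ, typical_range)
        print(f"{species}: typical range within hybrid range = {covered}")
```

Because both species' typical ranges are nested inside the hybrid's range, this measurement alone cannot favor either parent.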

| DISCUSSION
The combination of evidence (visual, bioacoustic, and genetic) confirms that the parents of the described individual were a rose-breasted grosbeak Pheucticus ludovicianus (female parent) and a scarlet tanager Piranga olivacea (male parent). While these two species breed sympatrically across much of eastern North America, they exhibit somewhat different habitat preferences: scarlet tanagers typically prefer unfragmented, mature forest, while rose-breasted grosbeaks will often occupy second growth, including forest with a relatively open canopy, although they will utilize adjacent edges or disturbed areas (Mowbray, 2020; Wyatt & Francis, 2020). The two species are phenotypically highly divergent and have likely not shared a common ancestor in >10 million years (Barker et al., 2015).
Our qualitative and quantitative analyses of the song showed that the vocalizations of this individual were highly similar to those of scarlet tanager and largely dissimilar to those of rose-breasted grosbeak. This individual's rapidly frequency-modulated song and "chick-burr" call were qualitatively very similar to the scarlet tanager's song and call, whereas rose-breasted grosbeaks do not produce rapidly frequency-modulated songs or "chick-burr" calls. In addition, the average number of syllables per song and the song duration were within the range of the scarlet tanager song but fell below those of the rose-breasted grosbeak song.
In addition to the analysis described above, we also used the "Merlin" sound identification mobile application from the Cornell Lab of Ornithology to evaluate our identification. This algorithm was trained on curated song recordings deposited in the Macaulay Library and can identify over 400 species by vocalization in North America. When playing the hybrid's song recording for the software, the program invariably identified it as a scarlet tanager, in line with our more detailed assessment of song characteristics described above. We note, however, that the trained model, architecture, and underlying data of the Merlin Sound ID feature have not been published, the classifier accuracy has not been described in the literature, and uncertainty of individual classifications is unreported, preventing more detailed comment on the context and implications of this result.

Shy (1984) found that scarlet tanagers lack regional dialects, suggesting that this species learns its song in its first breeding season instead of at its natal site. The similarity between the syllables of this bird's song and that of a counter-singing scarlet tanager suggests that it may have learned its song from its paternal parent or nearby neighbors at this breeding location. Hand-reared rose-breasted grosbeaks are unable to sing correctly, suggesting a critical developmental period in this species (Dunham, 1966) but it is unknown how the singing that the bird is exposed to in this critical period correlates with the song ultimately learned by the individual.
The genome of the hybrid was exceptionally heterozygous (Figures 3 and 4), as is expected from an F1 hybrid with highly divergent parents, with a heterozygous base every 100-150 bp. This is likely an underestimate, for two reasons. First, given that the parental genera were represented by RNA sequence data, the only regions we analyzed in depth here were coding regions, and these regions are constrained by stronger purifying selection than non-coding sequences (Ward & Kellis, 2012). Second, accurately calling heterozygous sites requires high coverage (Song et al., 2016); thus, we presume that many of the sites that differed between the parental genera, but where the hybrid had one or the other genotype (i.e., was not heterozygous), might actually be heterozygous in the hybrid, and that we lack the coverage depth to decisively call a heterozygous genotype. The fact that the sites where the hybrid had one or the other parental genotype occur in nearly equal frequencies (24 vs. 26 sites of 137) supports this interpretation.
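The coverage argument can be made concrete with a short sketch. It is an illustration only, not part of the analysis reported here, and it rests on two stated assumptions: reads at a truly heterozygous site sample the two alleles with equal probability and without error, and the 24 vs. 26 split is compared with a 50:50 expectation using an exact two-sided binomial test. All function names are hypothetical.

```python
from math import comb


def prob_single_allele_sampled(depth):
    """Probability that every read at a truly heterozygous site carries the
    same allele, assuming equal allele sampling and no sequencing error."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return 0.5 ** (depth - 1)


def two_sided_binomial_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count `k` out of `n`."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small tolerance so that ties in probability are counted despite rounding.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    for depth in (1, 2, 3, 5, 10):
        print(f"depth {depth:2d}: P(heterozygote missed) = "
              f"{prob_single_allele_sampled(depth):.4f}")
    # Sites where the hybrid showed only one parental genotype: 24 vs. 26.
    p_value = two_sided_binomial_p(24, 24 + 26)
    print(f"two-sided p-value for a 24:26 split = {p_value:.3f}")
```

Under these assumptions, a heterozygous site covered by only two or three reads is missed a large fraction of the time, and a 24:26 split shows no detectable departure from the even split expected if such sites are undercalled heterozygotes.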
We also note that while our comparison dataset of low-coverage warbler genomes did not explicitly include any known hybrids, the which diverged approximately 12 mya (Sun et al., 2017). A hybrid between Aglaiocercus kingii x Metallura tyrianthina, known as the "Rogitama hummingbird," was originally described by Stiles and Cortés-Herrera (2015), and further analysis was later provided by An important caveat to our work is that while we were able to determine genetic parentage with very high confidence, our evidence was not 100% confirmed, as we were only able to include nuclear data from a congener for one of the parental taxa. We could have achieved near perfect certainty in confirming parental taxa by including additional sequencing of both parental species. However, the strength of morphological, bioacoustic, and genetic evidence supports that the parents of this hybrid were a rose-breasted grosbeak and a scarlet tanager, and additional sequencing would be unlikely to yield new insight.
Documentation and identification of this hybrid support the utility of low-coverage whole-genome sequencing, particularly when combined with diverse data archives and bioacoustic information, as a straightforward method to assign ancestry for putative hybrid individuals. More generally, the observation that this individual, a hybrid between such highly divergent parental taxa, lived until adulthood and behaved like a typical territorial passerine serves as another example of the survival capacity of birds with exceptionally heterozygous genomes. We note, however, that we could not verify reproduction by this individual hybrid, and a careful search for the bird on territory in 2021 was unsuccessful.

ACKNOWLEDGMENTS
The authors would like to thank the recordists who have contributed recordings to Xeno-Canto and the Macaulay Library of Natural Sounds. DPLT was supported by Pennsylvania State University, and start-up funds from the Eberly College of Science and the Huck Institutes of the Life Sciences.

CONFLICT OF INTEREST
The authors declare no conflict of interest.